NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Human Heterogeneity Invariant Stress Sensing

https://doi.org/10.1145/3749465

Xiao, Yi; Sharma, Harshit; Kaur, Sawinder; Bergen-Cico, Dessa; Salekin, Asif (September 2025, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

Stress affects physical and mental health, and wearable devices have been widely used to detect daily stress through physiological signals. However, these signals vary due to factors such as individual differences and health conditions, making generalizing machine learning models difficult. To address these challenges, we present Human Heterogeneity Invariant Stress Sensing (HHISS), a domain generalization approach designed to find consistent patterns in stress signals by removing person-specific differences. This helps the model perform more accurately across new people, environments, and stress types not seen during training. Its novelty lies in proposing a novel technique called person-wise sub-network pruning intersection to focus on shared features across individuals, alongside preventing overfitting by leveraging continuous labels while training. The present study focuses on people with opioid use disorder (OUD)---a group where stress responses can change dramatically depending on the presents of opioids in their system, including daily timed medication for OUD (MOUD). Since stress often triggers cravings, a model that can adapt well to these changes could support better OUD rehabilitation and recovery. We tested HHISS on seven different stress datasets---four which we collected ourselves and three public datasets. Four are from lab setups, one from a controlled real-world driving setting, and two are from real-world in-the-wild field datasets with no constraints. The present study is the first known to evaluate how well a stress detection model works across such a wide range of data. Results show HHISS consistently outperformed state-of-the-art baseline methods, proving both effective and practical for real-world use. Ablation studies, empirical justifications, and runtime evaluations confirm HHISS's feasibility and scalability for mobile stress sensing in sensitive real-world applications.
more » « less
Free, publicly-accessible full text available September 3, 2026
Psychophysiology-aided Perceptually Fluent Speech Analysis of Children Who Stutter

https://doi.org/10.1145/3716550.3722019

Xiao, Yi; Sharma, Harshit; Tumanova, Victoria; Salekin, Asif (May 2025, ACM)

Free, publicly-accessible full text available May 6, 2026
CRoP: Context-wise Robust Static Human-Sensing Personalization

https://doi.org/10.1145/3729483

Kaur, Sawinder; Gump, Avery; Xiao, Yi; Xin, Jingyu; Sharma, Harshit; Benway, Nina R; Preston, Jonathan L; Salekin, Asif (June 2025, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

The advancement in deep learning and internet-of-things have led to diverse human sensing applications. However, distinct patterns in human sensing, influenced by various factors or contexts, challenge the generic neural network model's performance due to natural distribution shifts. To address this, personalization tailors models to individual users. Yet most personalization studies overlook intra-user heterogeneity across contexts in sensory data, limiting intra-user generalizability. This limitation is especially critical in clinical applications, where limited data availability hampers both generalizability and personalization. Notably, intra-user sensing attributes are expected to change due to external factors such as treatment progression, further complicating the challenges. To address the intra-user generalization challenge, this work introduces CRoP, a novel static personalization approach. CRoP leverages off-the-shelf pre-trained models as generic starting points and captures user-specific traits through adaptive pruning on a minimal sub-network while allowing generic knowledge to be incorporated in remaining parameters. CRoP demonstrates superior personalization effectiveness and intra-user robustness across four human-sensing datasets, including two from real-world health domains, underscoring its practical and social impact. Additionally, to support CRoP's generalization ability and design choices, we provide empirical justification through gradient inner product analysis, ablation studies, and comparisons against state-of-the-art baselines.
more » « less
Free, publicly-accessible full text available June 9, 2026
Reading Between the Heat: Co-Teaching Body Thermal Signatures for Non-intrusive Stress Detection

https://doi.org/10.1145/3631441

Xiao, Yi; Sharma, Harshit; Zhang, Zhongyang; Bergen-Cico, Dessa; Rahman, Tauhidur; Salekin, Asif (December 2023, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

Stress impacts our physical and mental health as well as our social life. A passive and contactless indoor stress monitoring system can unlock numerous important applications such as workplace productivity assessment, smart homes, and personalized mental health monitoring. While the thermal signatures from a user's body captured by a thermal camera can provide important information about the fight-flight response of the sympathetic and parasympathetic nervous system, relying solely on thermal imaging for training a stress prediction model often lead to overfitting and consequently a suboptimal performance. This paper addresses this challenge by introducing ThermaStrain, a novel co-teaching framework that achieves high-stress prediction performance by transferring knowledge from the wearable modality to the contactless thermal modality. During training, ThermaStrain incorporates a wearable electrodermal activity (EDA) sensor to generate stress-indicative representations from thermal videos, emulating stress-indicative representations from a wearable EDA sensor. During testing, only thermal sensing is used, and stress-indicative patterns from thermal data and emulated EDA representations are extracted to improve stress assessment. The study collected a comprehensive dataset with thermal video and EDA data under various stress conditions and distances. ThermaStrain achieves an F1 score of 0.8293 in binary stress classification, outperforming the thermal-only baseline approach by over 9%. Extensive evaluations highlight ThermaStrain's effectiveness in recognizing stress-indicative attributes, its adaptability across distances and stress scenarios, real-time executability on edge platforms, its applicability to multi-individual sensing, ability to function on limited visibility and unfamiliar conditions, and the advantages of its co-teaching approach. These evaluations validate ThermaStrain's fidelity and its potential for enhancing stress assessment.
more » « less
Full Text Available
"Reading Between the Heat": Co-Teaching Body Thermal Signatures for Non-intrusive Stress Detection

Xiao, Yi; Sharma, Harshit; Zhang, Zhongyang; Bergen-Cico, Dessa; Rahman, Tauhidur; Salekin, Asif (October 2023, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

Stress impacts our physical and mental health as well as our social life. A passive and contactless indoor stress monitoring system can unlock numerous important applications such as workplace productivity assessment, smart homes, and personalized mental health monitoring. While the thermal signatures from a user’s body captured by a thermal camera can provide important information about the “fight-flight” response of the sympathetic and parasympathetic nervous system, relying solely on thermal imaging for training a stress prediction model often lead to overfitting and consequently a suboptimal performance. This paper addresses this challenge by introducing ThermaStrain, a novel co-teaching framework that achieves high-stress prediction performance by transferring knowledge from the wearable modality to the contactless thermal modality. During training, ThermaStrain incorporates a wearable electrodermal activity (EDA) sensor to generate stress-indicative representations from thermal videos, emulating stress-indicative representations from a wearable EDA sensor. During testing, only thermal sensing is used, and stress-indicative patterns from thermal data and emulated EDA representations are extracted to improve stress assessment. The study collected a comprehensive dataset with thermal video and EDA data under various stress conditions and distances. ThermaStrain achieves an F1 score of 0.8293 in binary stress classification, outperforming the thermal-only baseline approach by over 9%. Extensive evaluations highlight ThermaStrain’s effectiveness in recognizing stress-indicative attributes, its adaptability across distances and stress scenarios, real-time executability on edge platforms, its applicability to multi-individual sensing, ability to function on limited visibility and unfamiliar conditions, and the advantages of its co-teaching approach. These evaluations validate ThermaStrain’s fidelity and its potential for enhancing stress assessment.
more » « less
Full Text Available
Privacy against Real-Time Speech Emotion Detection via Acoustic Adversarial Evasion of Machine Learning

https://doi.org/10.1145/3610887

Testa, Brian; Xiao, Yi; Sharma, Harshit; Gump, Avery; Salekin, Asif (September 2023, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

Smart speaker voice assistants (VAs) such as Amazon Echo and Google Home have been widely adopted due to their seamless integration with smart home devices and the Internet of Things (IoT) technologies. These VA services raise privacy concerns, especially due to their access to our speech. This work considers one such use case: the unaccountable and unauthorized surveillance of a user's emotion via speech emotion recognition (SER). This paper presents DARE-GP, a solution that creates additive noise to mask users' emotional information while preserving the transcription-relevant portions of their speech. DARE-GP does this by using a constrained genetic programming approach to learn the spectral frequency traits that depict target users' emotional content, and then generating a universal adversarial audio perturbation that provides this privacy protection. Unlike existing works, DARE-GP provides: a) real-time protection of previously unheard utterances, b) against previously unseen black-box SER classifiers, c) while protecting speech transcription, and d) does so in a realistic, acoustic environment. Further, this evasion is robust against defenses employed by a knowledgeable adversary. The evaluations in this work culminate with acoustic evaluations against two off-the-shelf commercial smart speakers using a small-form-factor (raspberry pi) integrated with a wake-word system to evaluate the efficacy of its real-world, real-time deployment.
more » « less
Psychophysiological Arousal in Young Children Who Stutter: An Interpretable AI Approach

https://doi.org/10.1145/3550326

Sharma, Harshit; Xiao, Yi; Tumanova, Victoria; Salekin, Asif (September 2022, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

The presented first-of-its-kind study effectively identifies and visualizes the second-by-second pattern differences in the physiological arousal of preschool-age children who do stutter (CWS) and who do not stutter (CWNS) while speaking perceptually fluently in two challenging conditions: speaking in stressful situations and narration. The first condition may affect children's speech due to high arousal; the latter introduces linguistic, cognitive, and communicative demands on speakers. We collected physiological parameters data from 70 children in the two target conditions. First, we adopt a novel modality-wise multiple-instance-learning (MI-MIL) approach to classify CWS vs. CWNS in different conditions effectively. The evaluation of this classifier addresses four critical research questions that align with state-of-the-art speech science studies' interests. Later, we leverage SHAP classifier interpretations to visualize the salient, fine-grain, and temporal physiological parameters unique to CWS at the population/group-level and personalized-level. While group-level identification of distinct patterns would enhance our understanding of stuttering etiology and development, the personalized-level identification would enable remote, continuous, and real-time assessment of stuttering children's physiological arousal, which may lead to personalized, just-in-time interventions, resulting in an improvement in speech fluency. The presented MI-MIL approach is novel, generalizable to different domains, and real-time executable. Finally, comprehensive evaluations are done on multiple datasets, presented framework, and several baselines that identified notable insights on CWSs' physiological arousal during speech production.
more » « less

Search for: All records